Quantifying Query Ambiguity

Authors

  • Steve Cronen-Townsend
  • Bruce Croft
Abstract

We develop a measure of a query with respect to a collection of documents with the aim of quantifying the query’s ambiguity with respect to those documents. This measure, the clarity score, is the relative entropy between a query language model and the corresponding collection language model. We substantiate that the clarity score measures the coherence and specificity of the language used in documents likely to satisfy the query. We also argue that it provides a suitable quantification of the (lack of) ambiguity of a query with respect to a collection of documents and has potential applications throughout the field of information retrieval. In particular, the clarity score is shown to correlate positively with average precision in evaluations using TREC test collections. Hence, as one example, the clarity score could serve as a predictor of query performance. Systems would then be able to identify vague information requests and respond differently than they would to clear and specific requests.
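As a rough illustration of the definition above, the clarity score can be sketched as the relative entropy (Kullback-Leibler divergence) of a query language model from the collection language model. The sketch below assumes both models are already available as unigram probability tables. The function name `clarity_score`, the toy vocabulary, and the base-2 logarithm (which gives a score in bits) are illustrative assumptions, not details taken from the abstract, and the estimation of the query model from retrieved documents is not shown.

```python
import math


def clarity_score(query_model, collection_model):
    """Relative entropy, in bits, of query_model from collection_model.

    Both arguments map terms to probabilities (hypothetical unigram models).
    """
    score = 0.0
    for term, p_query in query_model.items():
        if p_query == 0.0:
            continue  # a term with zero query probability contributes nothing
        p_coll = collection_model.get(term, 0.0)
        if p_coll == 0.0:
            # The divergence is undefined here; a real system would smooth.
            raise ValueError(f"term {term!r} has zero collection probability")
        score += p_query * math.log2(p_query / p_coll)
    return score


# Toy collection model: four equally likely terms (made-up numbers).
collection = {"bank": 0.25, "river": 0.25, "loan": 0.25, "water": 0.25}

# A query model concentrated on two related terms versus one that simply
# mirrors the collection distribution.
focused = {"bank": 0.5, "loan": 0.5}
vague = dict(collection)

focused_score = clarity_score(focused, collection)
vague_score = clarity_score(vague, collection)
print(focused_score, vague_score)
```

In this toy setting the focused model scores higher than the one that matches the collection, which is the sense in which a low score signals an ambiguous query: its language is indistinguishable from the collection as a whole.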

Similar resources

Query Ambiguity Revisited: Clickthrough Measures for Distinguishing Informational and Ambiguous Queries

Understanding query ambiguity in web search remains an important open problem. In this paper we reexamine query ambiguity by analyzing the result clickthrough data. Previously proposed clickthrough-based metrics of query ambiguity tend to conflate informational and ambiguous queries. To distinguish between these query classes, we introduce novel metrics based on the entropy of the click distrib...

Quantifying Metrical Ambiguity

This paper explores how data generated by meter induction models may be recycled to quantify metrical ambiguity, which is calculated by measuring the dispersion of metrical induction strengths across a population of possible meters. A measure of dispersion commonly used in economics to measure income inequality, the Gini coefficient, is introduced for this purpose. The value of this metric as a...

The Effects of Information Request Ambiguity and Construct Incongruence on Query Development

This paper examines the effects of information request ambiguity and construct incongruence on end users’ ability to develop SQL queries with an interactive relational database query language. In this experiment, ambiguity in information requests adversely affected accuracy and efficiency. Incongruities among the information request, the query syntax, and the data representation adversely affec...

Query Ambiguity Identification Based on User Behavior Information

Query ambiguity identification is of vital importance for Web search related studies such as personalized search or diversified ranking. Different from existing solutions which usually require a supervised topic classification process, we propose a query ambiguity identification framework which takes user behavior features collected from click-through logs into consideration. Especially, beside...

On the Origin of Ambiguity in Efficient Communication

We investigate both the locus of ambiguity in the architecture of language and the origin of ambiguity in natural communication. We 1) locate ambiguity at the externalization branch of language, 2) provide a rigorous, general definition of ambiguity through the concept of logical irreversibility, quantifying the amount of ambiguity within the framework of Shannon’s information theory, and 3) pr...

Publication date: 2002